Search CORE

10 research outputs found

Dynamic reconfiguration frameworks for high-performance reliable real-time reconfigurable computing

Author: Adetomi Adewale Akinlawon
Publication venue: The University of Edinburgh
Publication date: 03/07/2019
Field of study

The sheer hardware-based computational performance and programming flexibility offered by reconfigurable hardware like Field-Programmable Gate Arrays (FPGAs) make them attractive for computing in applications that require high performance, availability, reliability, real-time processing, and high efficiency. Fueled by fabrication process scaling, modern reconfigurable devices come with ever greater quantities of on-chip resources, allowing a more complex variety of applications to be developed. Thus, the trend is that technology giants like Microsoft, Amazon, and Baidu now embrace reconfigurable computing devices likes FPGAs to meet their critical computing needs. In addition, the capability to autonomously reprogramme these devices in the field is being exploited for reliability in application domains like aerospace, defence, military, and nuclear power stations. In such applications, real-time computing is important and is often a necessity for reliability. As such, applications and algorithms resident on these devices must be implemented with sufficient considerations for real-time processing and reliability. Often, to manage a reconfigurable hardware device as a computing platform for a multiplicity of homogenous and heterogeneous tasks, reconfigurable operating systems (ROSes) have been proposed to give a software look to hardware-based computation. The key requirements of a ROS include partitioning, task scheduling and allocation, task configuration or loading, and inter-task communication and synchronization. Existing ROSes have met these requirements to varied extents. However, they are limited in reliability, especially regarding the flexibility of placing the hardware circuits of tasks on device’s chip area, the problem arising more from the partitioning approaches used. Indeed, this problem is deeply rooted in the static nature of the on-chip inter-communication among tasks, hampering the flexibility of runtime task relocation for reliability. This thesis proposes the enabling frameworks for reliable, available, real-time, efficient, secure, and high-performance reconfigurable computing by providing techniques and mechanisms for reliable runtime reconfiguration, and dynamic inter-circuit communication and synchronization for circuits on reconfigurable hardware. This work provides task configuration infrastructures for reliable reconfigurable computing. Key features, especially reliability-enabling functionalities, which have been given little or no attention in state-of-the-art are implemented. These features include internal register read and write for device diagnosis; configuration operation abort mechanism, and tightly integrated selective-area scanning, which aims to optimize access to the device’s reconfiguration port for both task loading and error mitigation. In addition, this thesis proposes a novel reliability-aware inter-task communication framework that exploits the availability of dedicated clocking infrastructures in a typical FPGA to provide inter-task communication and synchronization. The clock buffers and networks of an FPGA use dedicated routing resources, which are distinct from the general routing resources. As such, deploying these dedicated resources for communication sidesteps the restriction of static routes and allows a better relocation of circuits for reliability purposes. For evaluation, a case study that uses a NASA/JPL spectrometer data processing application is employed to demonstrate the improved reliability brought about by the implemented configuration controller and the reliability-aware dynamic communication infrastructure. It is observed that up to 74% time saving can be achieved for selective-area error mitigation when compared to state-of-the-art vendor implementations. Moreover, an improvement in overall system reliability is observed when the proposed dynamic communication scheme is deployed in the data processing application. Finally, one area of reconfigurable computing that has received insufficient attention is security. Meanwhile, considering the nature of applications which now turn to reconfigurable computing for accelerating compute-intensive processes, a high premium is now placed on security, not only of the device but also of the applications, from loading to runtime execution. To address security concerns, a novel secure and efficient task configuration technique for task relocation is also investigated, providing configuration time savings of up to 32% or 83%, depending on the device; and resource usage savings in excess of 90% compared to state-of-the-art

Edinburgh Research Archive

A Functionality-Based Runtime Relocation System for Circuits on Heterogeneous FPGAs

Author: Adetomi Adewale
Arslan Tughrul
Enemali Godwin ilemona
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2018
Field of study

Crossref

Edinburgh Research Explorer

Proxy Circuits for Fault-Tolerant Primitive Interfacing in Reconfigurable Devices Targeting Extreme Environments

Author: Adetomi Adewale
Arslan Tughrul
McDonald-Maier Klaus
Saha Sangeet
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/09/2020
Field of study

Continuous interface access to device-level primitives in reconfigurable devices in extreme environments is key to reliable operation. However, it is possible for a primitive's interface controller, which is static to be rendered non-operational by a permanent damage in the controller's circuitry. In order to mitigate this, this paper proposes the use of relocatable proxy circuits to provide remote interfacing capability to primitives from anywhere on a reconfigurable device. A demonstration with device register read controller shows that an improvement in fault-tolerance can be achieved

University of Essex Research Repository

Enabling Dynamic Communication for Runtime Circuit Relocation

Author: Adetomi Adewale
Arslan Tughrul
Enemali Godwin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Edinburgh Research Explorer

ResearchOnline@GCU

Unlocking the Potential of Two-Point Cells for Energy-Efficient and Resilient Training of Deep Nets

Author: Adeel Ahsan
Adetomi Adewale
Ahmed Khubaib
Arslan Tughrul
Hussain Amir
Phillips W. A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2023
Field of study

Edinburgh Research Explorer

A dynamic partial reconfiguration design for camera systems

Author: Adetomi Adewale
Arslan Tughrul
Ebrahim Ali
Khalifat Jalal
Publication venue
Publication date: 31/05/2015
Field of study

The image-processing pipeline is the core part of any camera system including digital still cameras, camcorders, camera phones and video surveillance equipments. The image-processing pipeline consists of a number of processing stages that enhance the image or remove any effects that are caused by surrounding conditions. These stages are computationally intensive and need special requirements to meet the real time processing. This paper discusses the pipeline parts and presents a high-performance and cost-effective implementation of the pipeline on Field Programmable Gate Arrays (FPGAs) using Dynamic Partial Reconfiguration (DPR) feature to exploit the FPGA resources over time and space. The paper shows that the implemented system adds much of flexibility to camera systems by using a reconfigurable region. The system can use an unlimited number of image processing pipeline stages to process the images without the need of huge number of logic resources to fit all the stages. Moreover, the stages are not fixed in this system, they can be changed upon the user's decision. The architecture is designed to process still images of size 1920×1080. Each stage could process a full frame within 7.25 ms. A fast configuration engine is designed and deployed in the system. The engine shows that it can outperform the engine provided with zynq SoC by three times. The overall throughput of the system reaches 250 Megapixel/s

Edinburgh Research Explorer

Scipedia

EnSuRe: Energy & Accuracy Aware Fault-tolerant Scheduling on Real-time Heterogeneous Systems

Author: Adetomi Adewale
Arslan Tughrul
Ehsan Shoaib
Kasap Server
McDonald-Maier Klaus
Saha Sangeet
Zhai Xiaojun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/06/2021
Field of study

This paper proposes an energy efficient real-time scheduling strategy called EnSuRe, which (i) executes real-time tasks on low power consuming primary processors to enhance the system accuracy by maintaining the deadline and (ii) provides reliability against a fixed number of transient faults by selectively executing backup tasks on high power consuming backup processor. Simulation results reveal that EnSuRe consumes nearly 25% less energy, compared to existing techniques, while satisfying the fault tolerance requirements. EnSuRe is also able to achieve 75% system accuracy with 50% system utilisation. Further, the obtained simulation outcomes are validated on benchmark tasks via a fault injection framework on Xilinx ZYNQ APSoC heterogeneous dual core platform

University of Essex Research Repository

Southampton (e-Prints Soton)

Coventry University Pure Portal

Low-Power Ultra-Small Edge AI Accelerators for Image Recognition with Convolution Neural Networks: Analysis and Future Directions

Author: Adewale Adetomi
Tughrul Arslan
Weison Lin
Publication venue: 'MDPI AG'
Publication date: 01/08/2021
Field of study

Edge AI accelerators have been emerging as a solution for near customers’ applications in areas such as unmanned aerial vehicles (UAVs), image recognition sensors, wearable devices, robotics, and remote sensing satellites. These applications require meeting performance targets and resilience constraints due to the limited device area and hostile environments for operation. Numerous research articles have proposed the edge AI accelerator for satisfying the applications, but not all include full specifications. Most of them tend to compare the architecture with other existing CPUs, GPUs, or other reference research, which implies that the performance exposé of the articles are not comprehensive. Thus, this work lists the essential specifications of prior art edge AI accelerators and the CGRA accelerators during the past few years to define and evaluate the low power ultra-small edge AI accelerators. The actual performance, implementation, and productized examples of edge AI accelerators are released in this paper. We introduce the evaluation results showing the edge AI accelerator design trend about key performance metrics to guide designers. Last but not least, we give out the prospect of developing edge AI’s existing and future directions and trends, which will involve other technologies for future challenging constraints

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

A functionality-based runtime relocation system for circuits on heterogeneous FPGAs

Author: Adetomi Adewale
Arslan Tughrul
Enemali Godwin
Seetharaman Gopalakrishnan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2018
Field of study

ResearchOnline@GCU

Unlocking the Potential of Two-Point Cells for Energy-Efficient and Resilient Training of Deep Nets

Author: Adeel Ahsan
Adetomi Adewale
Ahmed Khubaib
Arslan Tughrul
Hussain Amir
Phillips W. A.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/06/2023
Field of study

Context-sensitive two-point layer 5 pyramidal cells (L5PCs) were discovered as long ago as 1999. However, the potential of this discovery to provide useful neural computation has yet to be demonstrated. Here we show for the first time how a transformative L5PCs-driven deep neural network (DNN), termed the multisensory cooperative computing (MCC) architecture, can effectively process large amounts of heterogeneous real-world audio-visual (AV) data, using far less energy compared to best available ‘point’ neuron-driven DNNs. A novel highly-distributed parallel implementation on a Xilinx UltraScale+ MPSoC device estimates energy savings up to 245759 × 50000 μ J (i.e., 62% less than the baseline model in a semi-supervised learning setup) where a single synapse consumes 8e−5μ J. In a supervised learning setup, the energy-saving can potentially reach up to 1250x less (per feedforward transmission) than the baseline model. The significantly reduced neural activity in MCC leads to inherently fast learning and resilience against sudden neural damage. This remarkable performance in pilot experiments demonstrates the embodied neuromorphic intelligence of our proposed cooperative L5PC that receives input from diverse neighbouring neurons as context to amplify the transmission of most salient and relevant information for onward transmission, from overwhelmingly large multimodal information utilised at the early stages of on-chip training. Our proposed approach opens new cross-disciplinary avenues for future on-chip DNN training implementations and posits a radical shift in current neuromorphic computing paradigms

Edinburgh Research Explorer

Repository@Napier